Sequence Models
Sequence-specific deep-learning models live under `CSharpNumerics.ML.Sequence`.
They keep the existing `IModel` contract by interpreting each `Matrix` row as a flattened (timesteps x features) sample.
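This row convention can be illustrated with a small, library-agnostic sketch (the `flatten_window`/`unflatten_row` helpers below are hypothetical, and time-major layout is an assumption, not something the library documents): a (timesteps x features) window is laid out so that row index `t * features + f` holds feature `f` at timestep `t`.

```python
def flatten_window(window):
    """Flatten a (timesteps x features) window into one time-major row."""
    return [x for step in window for x in step]

def unflatten_row(row, features):
    """Recover the (timesteps x features) window from a flattened row."""
    return [row[i:i + features] for i in range(0, len(row), features)]

# Two timesteps, three features each.
window = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]
row = flatten_window(window)  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
assert unflatten_row(row, features=3) == window
```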
Currently available models:
- `CNN1DClassifier` in `CSharpNumerics.ML.Sequence.Models.Classification`
- `CNN1DRegressor` in `CSharpNumerics.ML.Sequence.Models.Regression`
- `LSTMClassifier` in `CSharpNumerics.ML.Sequence.Models.Classification`
- `LSTMRegressor` in `CSharpNumerics.ML.Sequence.Models.Regression`
- `BiLSTMClassifier` in `CSharpNumerics.ML.Sequence.Models.Classification`
- `BiLSTMRegressor` in `CSharpNumerics.ML.Sequence.Models.Regression`
Current sequence infrastructure:
- `ISequenceModel` in `CSharpNumerics.ML.Sequence.Interfaces`
- `ConvolutionPaddingMode` in `CSharpNumerics.ML.Sequence.Enums`
- `Conv1DLayer` in `CSharpNumerics.ML.Sequence.Layers`
- `MaxPool1DLayer` in `CSharpNumerics.ML.Sequence.Layers`
- `GlobalAvgPool1DLayer` in `CSharpNumerics.ML.Sequence.Layers`
- `FlattenLayer` in `CSharpNumerics.ML.Sequence.Layers`
- `LSTMLayer` in `CSharpNumerics.ML.Sequence.Layers`
- `BiLSTMLayer` in `CSharpNumerics.ML.Sequence.Layers`
CNN1D Architecture
Default CNN1D architecture:
Conv1D -> GlobalAvgPool -> Dense(hidden) -> Dense(output)
Optional variants:
- `UseMaxPooling = true` inserts a `MaxPool1DLayer` after the convolution.
- `UseGlobalAveragePooling = false` switches to a `FlattenLayer` before the dense projection.
Shared CNN1D hyperparameters:
`TimeSteps`, `Features`, `Filters`, `KernelSize`, `ConvStride`, `Padding` (`Same`, `Valid`), `UseMaxPooling`, `PoolSize`, `PoolStride`, `UseGlobalAveragePooling`, `HiddenUnits`, `LearningRate`, `Epochs`, `BatchSize`, `Activation`
Additional regression hyperparameters: `L2` (L2 regularization strength)
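How `Padding` and `ConvStride` interact follows the usual 1-D convolution conventions (assumed here, since the docs don't spell out the formulas; the helper below is illustrative, not library code): `Valid` produces `floor((T - K) / S) + 1` output steps, while `Same` pads the input so the output has `ceil(T / S)` steps.

```python
import math

def conv1d_output_length(timesteps, kernel_size, stride, padding):
    """Output length of a 1-D convolution under the usual Same/Valid rules."""
    if padding == "valid":
        return (timesteps - kernel_size) // stride + 1
    if padding == "same":
        return math.ceil(timesteps / stride)
    raise ValueError(f"unknown padding mode: {padding}")

# TimeSteps=128, KernelSize=5, ConvStride=1
assert conv1d_output_length(128, 5, 1, "same") == 128   # length preserved
assert conv1d_output_length(128, 5, 1, "valid") == 124  # 128 - 5 + 1
```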
LSTM Architecture
Default LSTM architecture:
LSTMLayer(returnSequences=false) -> Dense(hidden) -> Dense(output)
The LSTM layer implements the standard four-gate equations (forget, input, output, cell candidate) with full BPTT and gradient clipping. Key features:
- Forget gate bias initialized to `1.0` to reduce vanishing gradients
- Configurable `ClipNorm` for gradient clipping (default: `5.0`)
- `returnSequences = false` outputs only the final hidden state
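The four-gate update and norm-based clipping described above follow the standard LSTM recipe; a scalar (hidden-size-1, single-feature) sketch of that math, with made-up weights purely for illustration and no connection to the library's internals:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One standard four-gate LSTM step with scalar states.

    W, U, b are dicts keyed by gate: f (forget), i (input),
    o (output), g (cell candidate).
    """
    f = sigmoid(W["f"] * x + U["f"] * h_prev + b["f"])    # forget gate
    i = sigmoid(W["i"] * x + U["i"] * h_prev + b["i"])    # input gate
    o = sigmoid(W["o"] * x + U["o"] * h_prev + b["o"])    # output gate
    g = math.tanh(W["g"] * x + U["g"] * h_prev + b["g"])  # cell candidate
    c = f * c_prev + i * g                                # new cell state
    h = o * math.tanh(c)                                  # new hidden state
    return h, c

def clip_by_norm(grads, clip_norm):
    """Rescale the whole gradient vector if its L2 norm exceeds clip_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= clip_norm:
        return grads
    return [g * clip_norm / norm for g in grads]

# Forget-gate bias starts at 1.0, matching the initialization described above.
W = {k: 0.5 for k in "fiog"}
U = {k: 0.1 for k in "fiog"}
b = {"f": 1.0, "i": 0.0, "o": 0.0, "g": 0.0}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=0.0, W=W, U=U, b=b)
```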
LSTM hyperparameters:
- `TimeSteps`
- `Features`
- `HiddenSize` - LSTM hidden/cell state dimension
- `HiddenUnits` - optional dense layer after the LSTM
- `ClipNorm` - max gradient norm (default: `5.0`)
- `LearningRate`
- `Epochs`
- `BatchSize`
- `Activation`
- `L2`
Bi-LSTM Architecture
Default Bi-LSTM architecture:
BiLSTMLayer(returnSequences=false) -> Dense(hidden) -> Dense(output)
The Bi-LSTM layer composes two `LSTMLayer` instances - one processing the input forwards, one backwards - and concatenates their hidden states per timestep, so the output dimension is 2 x `HiddenSize`.
When `returnSequences = false`, the output is `[h_fwd_T | h_bwd_1]`: the forward pass's final hidden state concatenated with the backward pass's state at the first timestep (which is the last state the backward pass computes).
Bi-LSTM hyperparameters are identical to LSTM (same `HiddenSize`, `ClipNorm`, etc.). The dense layer automatically adapts to the 2 x `HiddenSize` input width.
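The concatenation rule can be sketched generically (the `run_lstm` stand-in below just returns per-step hidden states; any recurrence with the claimed shapes would behave the same way, so this is an illustration, not the library's code):

```python
def bilstm_outputs(xs, run_lstm, return_sequences=False):
    """Compose forward and backward passes and concatenate hidden states.

    run_lstm(sequence) -> list of per-step hidden-state vectors.
    """
    h_fwd = run_lstm(xs)                  # processes t = 1..T
    h_bwd = run_lstm(list(reversed(xs)))  # processes t = T..1
    h_bwd = list(reversed(h_bwd))         # re-align to input order
    if return_sequences:
        # Per-step output: [h_fwd_t | h_bwd_t], width 2 x HiddenSize.
        return [f + b for f, b in zip(h_fwd, h_bwd)]
    # Final output: forward state at t = T, backward state at t = 1.
    return h_fwd[-1] + h_bwd[0]

# Toy "LSTM": hidden state is the cumulative sum of scalar inputs.
def toy_lstm(xs):
    h, out = 0.0, []
    for x in xs:
        h += x
        out.append([h])
    return out

out = bilstm_outputs([1.0, 2.0, 3.0], toy_lstm)
# Forward final state is [6.0]; the backward state at t=1 has consumed the
# whole reversed sequence, so it is also [6.0] -> output [6.0, 6.0].
```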
Example with SupervisedExperiment (CNN1D):
```csharp
using CSharpNumerics.ML;
using CSharpNumerics.ML.Enums;
using CSharpNumerics.ML.Experiment;
using CSharpNumerics.ML.Sequence.Models.Classification;

var result = SupervisedExperiment
    .For(X, y)
    .WithGrid(new PipelineGrid()
        .AddModel<CNN1DClassifier>(g => g
            .Add("TimeSteps", 128)
            .Add("Features", 1)
            .Add("Filters", 8)
            .Add("KernelSize", 5)
            .Add("HiddenUnits", 16)
            .Add("LearningRate", 0.01)
            .Add("Epochs", 200)
            .Add("BatchSize", 16)
            .Add("Padding", CSharpNumerics.ML.Sequence.Enums.ConvolutionPaddingMode.Same)
            .Add("Activation", ActivationType.ReLU)))
    .WithCrossValidator(CrossValidatorConfig.KFold(folds: 5))
    .Run();
```
Example with SupervisedExperiment (LSTM):
```csharp
using CSharpNumerics.ML;
using CSharpNumerics.ML.Enums;
using CSharpNumerics.ML.Experiment;
using CSharpNumerics.ML.Sequence.Models.Classification;

var result = SupervisedExperiment
    .For(X, y)
    .WithGrid(new PipelineGrid()
        .AddModel<LSTMClassifier>(g => g
            .Add("TimeSteps", 128)
            .Add("Features", 1)
            .Add("HiddenSize", 32)
            .Add("HiddenUnits", 16)
            .Add("LearningRate", 0.001)
            .Add("Epochs", 200)
            .Add("BatchSize", 16)
            .Add("ClipNorm", 5.0)
            .Add("Activation", ActivationType.ReLU)))
    .WithCrossValidator(CrossValidatorConfig.KFold(folds: 5))
    .Run();
```
TimeSeries Integration - SequenceDataHelper
`SequenceDataHelper` bridges `TimeSeries` (from `CSharpNumerics.Statistics.Data`) to the sequence model pipeline by creating sliding-window samples.
```csharp
using CSharpNumerics.ML.Sequence;
using CSharpNumerics.Statistics.Data;

// Load a light curve from CSV (columns: Time, Flux, Label)
var ts = TimeSeries.FromCsv("lightcurve.csv");

// Create windows of 128 timesteps, stride 1, using column 1 ("Label") as target
var (X, y) = SequenceDataHelper.CreateWindows(ts, windowSize: 128, labelColumnIndex: 1, stride: 1);

// X shape: [numWindows x 128] (1 feature: Flux)
// y shape: [numWindows] (label from last timestep in each window)
```
Overloads:
- `CreateWindows(TimeSeries, windowSize, labelColumnIndex, stride)` - extracts features and labels from a `TimeSeries`, excluding the label column from the features.
- `CreateWindows(double[][], double[], windowSize, stride)` - works with raw column arrays when labels are computed separately.
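The windowing semantics can be sketched in a few lines (an illustrative single-feature re-implementation, not the library's code): windows cover `windowSize` consecutive steps, start every `stride` steps, and take their label from the last timestep of the window.

```python
def create_windows(feature, label, window_size, stride):
    """Slide a window over one feature column; label = last step's label."""
    X, y = [], []
    for start in range(0, len(feature) - window_size + 1, stride):
        end = start + window_size
        X.append(feature[start:end])
        y.append(label[end - 1])
    return X, y

flux = [1.0, 1.0, 0.8, 1.0, 1.0, 0.8]
labels = [0, 0, 1, 0, 0, 1]
X, y = create_windows(flux, labels, window_size=3, stride=2)
# Two windows starting at indices 0 and 2; labels taken at indices 2 and 4.
```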
Exoplanet-Transit Detection Example
Synthetic Kepler-like light curve -> windowed samples -> CNN1D classification:
```csharp
using CSharpNumerics.ML;
using CSharpNumerics.ML.Enums;
using CSharpNumerics.ML.Experiment;
using CSharpNumerics.ML.Sequence;
using CSharpNumerics.ML.Sequence.Models.Classification;
using CSharpNumerics.Statistics.Data;

// 1. Build a TimeSeries with flux and transit labels
var ts = new TimeSeries(times, new[] { flux, labels }, new[] { "Flux", "Label" });

// 2. Window into samples
var (X, y) = SequenceDataHelper.CreateWindows(ts, windowSize: 20, labelColumnIndex: 1, stride: 5);

// 3. Train a CNN1DClassifier with grid search
var result = SupervisedExperiment
    .For(X, y)
    .WithGrid(new PipelineGrid()
        .AddModel<CNN1DClassifier>(g => g
            .Add("TimeSteps", 20)
            .Add("Features", 1)
            .Add("Filters", 8)
            .Add("KernelSize", 5)
            .Add("HiddenUnits", 8)
            .Add("LearningRate", 0.02)
            .Add("Epochs", 150)
            .Add("BatchSize", 16)
            .Add("Activation", ActivationType.ReLU)))
    .WithCrossValidator(CrossValidatorConfig.KFold(folds: 3))
    .Run();

// result.BestScore -> transit detection accuracy
```
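The `times`, `flux`, and `labels` arrays are assumed to exist before the snippet runs. One way to fabricate a Kepler-like curve for experimentation is to inject periodic box-shaped dips into a flat, noisy baseline (every name and parameter below is illustrative):

```python
import random

def synthetic_light_curve(n=1000, period=100, duration=5, depth=0.01,
                          noise=0.001, seed=42):
    """Flat flux of 1.0 plus Gaussian noise, with periodic box-shaped
    transit dips of the given depth; labels mark in-transit samples."""
    rng = random.Random(seed)
    times = [float(t) for t in range(n)]
    flux, labels = [], []
    for t in range(n):
        in_transit = (t % period) < duration
        base = 1.0 - depth if in_transit else 1.0
        flux.append(base + rng.gauss(0.0, noise))
        labels.append(1.0 if in_transit else 0.0)
    return times, flux, labels

times, flux, labels = synthetic_light_curve()
```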
Neural Network Building Blocks
The neural-network stack now exposes reusable components for sequence-oriented architectures without changing the existing IModel contract.
Reusable dense/activation orchestration remains in CSharpNumerics.ML.NeuralNetwork, while sequence-specific layers and models live under CSharpNumerics.ML.Sequence.
Available infrastructure:
- `Activations` for reusable ReLU, Sigmoid, Tanh, Linear, and Softmax transforms
- `ILayer` for modular forward/backward layer composition
- `DenseLayer` for trainable fully connected sequence steps
- `SequentialModel` for stacking layers with shared forward/backward orchestration
These types are the reusable foundation for both generic feedforward models and the sequence-specific components in CSharpNumerics.ML.Sequence.
Example:
```csharp
using CSharpNumerics.ML.Enums;
using CSharpNumerics.ML.NeuralNetwork;
using CSharpNumerics.ML.NeuralNetwork.Layers;
using CSharpNumerics.ML.Sequence.Models.Classification;
using CSharpNumerics.Numerics.Objects;
using CSharpNumerics.Numerics.Optimization.SingleObjective;

var model = new SequentialModel(
    new DenseLayer(4, 8, ActivationType.ReLU),
    new DenseLayer(8, 1, ActivationType.Linear));

var inputSequence = new[]
{
    new VectorN(new[] { 0.2, 0.4, 0.6, 0.8 })
};

VectorN prediction = model.ForwardSingle(inputSequence);
VectorN lossGradient = prediction - new VectorN(new[] { 1.0 });
model.BackwardSingle(lossGradient);
model.ApplyGradients(
    new GradientDescent(learningRate: 0.01),
    new GradientDescent(learningRate: 0.01),
    batchSize: 1);

var classifier = new CNN1DClassifier
{
    TimeSteps = 128,
    Features = 1,
    Filters = 8,
    KernelSize = 5,
    HiddenUnits = 16,
    LearningRate = 0.01,
    Epochs = 200
};
```
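The forward/backward/apply-gradients cycle used above can be mimicked with generic dense-layer math (a minimal, dependency-free sketch of the standard mechanics; `TinyDense` is invented for illustration and is not the library's `DenseLayer`):

```python
import random

class TinyDense:
    """Minimal dense layer: forward caches the input, backward returns the
    gradient w.r.t. the input and accumulates weight/bias gradients."""

    def __init__(self, n_in, n_out, seed=0):
        rng = random.Random(seed)
        self.W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                  for _ in range(n_out)]
        self.b = [0.0] * n_out
        self.dW = [[0.0] * n_in for _ in range(n_out)]
        self.db = [0.0] * n_out

    def forward(self, x):
        self.x = x  # cached for backward
        return [sum(w * xi for w, xi in zip(row, x)) + bi
                for row, bi in zip(self.W, self.b)]

    def backward(self, dy):
        for j, g in enumerate(dy):
            self.db[j] += g
            for i, xi in enumerate(self.x):
                self.dW[j][i] += g * xi
        # dL/dx_i = sum_j dy_j * W[j][i]
        return [sum(dy[j] * self.W[j][i] for j in range(len(dy)))
                for i in range(len(self.x))]

    def apply_gradients(self, lr, batch_size):
        for j in range(len(self.b)):
            self.b[j] -= lr * self.db[j] / batch_size
            for i in range(len(self.W[j])):
                self.W[j][i] -= lr * self.dW[j][i] / batch_size
        self.dW = [[0.0] * len(r) for r in self.dW]  # reset accumulators
        self.db = [0.0] * len(self.db)

# One forward/backward/apply cycle, mirroring the C# flow above.
layer = TinyDense(n_in=2, n_out=1)
prediction = layer.forward([1.0, 2.0])
dx = layer.backward([1.0])            # upstream gradient of 1.0
layer.apply_gradients(lr=0.1, batch_size=1)
```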